Top 10 AI Models in August 2024
In terms of performance, OpenAI's language models are now competing with those developed by Google for its Gemini chatbot. At least, that's what the latest update of the Chatbot Arena ranking suggests, a leaderboard designed by researchers and students from the prestigious University of California, Berkeley, with the support of HuggingFace.
Updated in real time and assigning a performance score to each AI model, the Chatbot Arena aims to objectively rank text-generating AIs based on user contributions. Users are invited to compare, for the same query, the responses provided by two AI models whose identities are hidden from them.
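To make the duel mechanism concrete, here is a minimal Python sketch of such a blind pairwise vote. The `ask_model` and `get_user_vote` callables are hypothetical stand-ins for the model APIs and the voting interface; the Arena's actual implementation is, of course, more involved.

```python
import random

def blind_duel(models, query, ask_model, get_user_vote):
    """Run one anonymous head-to-head comparison, Chatbot Arena style."""
    # Pick two distinct models; their identities stay hidden from the voter.
    model_a, model_b = random.sample(models, 2)

    # Both models answer the exact same query.
    response_a = ask_model(model_a, query)
    response_b = ask_model(model_b, query)

    # The user judges only the anonymized responses ("A" vs "B" or a tie).
    vote = get_user_vote(response_a, response_b)

    # Identities are revealed only after the vote has been recorded.
    return {"model_a": model_a, "model_b": model_b, "vote": vote}
```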
Top 10 Best Performing Language Models in August 2024
In July 2024, the company behind ChatGPT, helped by the rollout of GPT-4o mini, took five of the top ten positions in this ranking, pushing most of its competitors out of the top 10 with the exception of Anthropic and Google. The Mountain View firm, which had been slightly behind until then, responded this month by launching Gemini 1.5 Flash: a model that is supposed to deliver more effective responses and that has been integrated into the free version of its eponymous chatbot.
With this new model, Google has thus claimed a spot on the Chatbot Arena podium, while also holding several other top-10 positions (6th, 8th and 9th). OpenAI nonetheless keeps a comfortable lead, placing two iterations on the podium and four in the top 10. Elsewhere in the ranking, Meta rises to seventh position with Meta Llama 3.1, while Anthropic drops two positions with Claude 3.5 Sonnet, its most advanced model.
1. ChatGPT 4o Latest: 1314 (Elo score)
2. Gemini 1.5 Pro 0801: 1297
3. GPT-4o 0513: 1286
4. GPT-4o mini 0718: 1274
5. Claude 3.5 Sonnet: 1271
6. Gemini Advanced: 1266
7. Meta Llama 3.1: 1263
8. Gemini 1.5 Pro: 1260
9. Gemini 1.5 Pro Preview: 1257
10. GPT-4 Turbo 0409: 1257
Chatbot Arena Ranking Criteria
Designed by the Large Model Systems Organization (LMSYS), which brings together American students and researchers, the Chatbot Arena uses the Elo system to rank generative models. But what is it, exactly? This rating system, used in several competitive disciplines such as esports and chess, has the advantage of being perfectly suited to the duel format on which the Chatbot Arena is based, as well as making it possible "to predict the outcome of the (next) match," according to LMSYS.
An Elo score can be thought of as a provisional rating that changes with each result. If, during a duel, a generative model with a high score, such as Gemini 1.5 Pro 0801 this month, loses to a lower-rated opponent, such as Claude 3.5 Sonnet, it will shed points. Conversely, it will gain points if it beats a supposedly stronger model.
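As a rough illustration, here is a minimal Python sketch of a standard Elo update applied to one such duel, using the scores from the ranking above. The K-factor of 32 is a conventional chess value chosen here purely for illustration; the exact parameters and aggregation method LMSYS uses may differ in their details.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Elo's predicted probability that A beats B:
    E_A = 1 / (1 + 10 ** ((R_B - R_A) / 400))."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Update both ratings after one duel.

    score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a tie.
    k (the K-factor) sets how fast ratings move; 32 is a
    conventional value used here for illustration only.
    """
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b

# Upset from the text: Gemini 1.5 Pro 0801 (1297) loses to the
# lower-rated Claude 3.5 Sonnet (1271).
gemini, claude = elo_update(1297, 1271, score_a=0.0)
print(round(gemini), round(claude))  # 1280 1288: the favorite drops, the underdog gains
```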
The Elo system is also used by Artificial Analysis, the organization that offers a similar ranking for image-generating AIs, such as Midjourney or DALL-E.